Abstract

Erasure codes are being increasingly used in distributed-storage systems in place of data-replication, since they 
provide the same level of reliability with much lower storage overhead. We consider the problem of constructing explicit 
erasure codes for distributed storage with the following desirable properties motivated by practice: (i) Maximum-Distance-Separable (MDS):
to provide maximal reliability at minimum storage overhead, (ii) Optimal repair-bandwidth:
to minimize the amount of data that must be transferred to repair a failed node from the remaining ones, (iii) Flexibility in
repair: to allow maximal flexibility in selecting the subset of nodes used for repair, which includes not requiring that all
surviving nodes be used for repair, (iv) Systematic Form: to ensure that the original data exists in uncoded form, and 
(v) Fast encoding: to minimize the cost of generating encoded data (enabled by a sparse generator matrix). Existing 
constructions in the literature satisfy only strict subsets of these desired properties. 

This paper presents the first explicit code construction which theoretically guarantees all the five desired properties 
simultaneously. Our construction builds on a powerful class of codes called Product-Matrix (PM) codes. PM codes satisfy 
properties (i)-(iii), and either (iv) or (v), but not both simultaneously. Indeed, native PM codes have inherent structure 
that leads to sparsity, but this structure is destroyed when the codes are made systematic. We first present an analytical 
framework for understanding the interaction between the design of PM codes and the systematic property. Using this 
framework, we provide an explicit code construction that simultaneously achieves all the above desired properties. We 
also present general ways of transforming existing storage and repair optimal codes to enable fast encoding through 
sparsity. In practice, such sparse codes result in encoding speedup by a factor of about 4 for typical parameters. 

I. Introduction 

Erasure codes are being increasingly used in distributed-storage systems instead of replication, since they provide the 
same level of reliability with much less storage overhead. Large scale distributed-storage systems have many practical 
requirements that guide the design of distributed-storage codes. 


Fig. 1: (a) Encoding and decoding for an [n, k] systematic MDS code. (b) Node repair: connecting to d = (n - 2) helper
nodes to repair failed node 1.


In large-scale systems, storage is a critical resource. For this reason, Maximum-Distance-Separable (MDS) codes such
as Reed-Solomon codes, which require the minimal storage overhead to achieve a desired level of reliability, are a
popular choice. An [n, k] MDS code allows the data to be stored across n nodes such that the entire data can
be recovered from the encoded data stored in any k (out of n) nodes. This is depicted in Figure 1a. Another critical
resource in distributed-storage systems is network bandwidth. In large-scale systems, failures are the norm rather than
the exception, and repair operations run continuously in the background. When nodes fail, they must be repaired by
downloading some data from the remaining nodes. These nodes are termed helper nodes. Figure 1b depicts a repair
operation where node 1 is being repaired with the help of nodes {2, ..., n - 1}. In large-scale systems, the repair
operations consume a significant amount of network bandwidth, and this has been one of the main deterrents to using
classical MDS erasure codes in such systems. Hence, it is important for storage codes to also minimize the amount
of bandwidth consumed during repair.


Another important system consideration is that the code should not require all (n - 1) surviving nodes to
repair a single failed node. If d denotes the number of helper nodes required for repair, then this property
requires d < (n - 1), as illustrated in Figure 1b. This property is crucial to allow redundant requests to be sent during
a repair operation, which is an effective approach to reducing latency in practical systems. That is, a failed node
can request help from many helpers, and can complete the repair as soon as enough nodes respond. This property is even more
critical for degraded reads, where a repair operation is performed to serve a read request for data stored in a busy or
otherwise unavailable node. Latency is crucial for degraded reads to meet the service level agreement in large-scale
systems.


Another practical requirement of storage codes is that of being in systematic form. That is, the original data must 
exist in the system in uncoded form. Figure 1a shows a systematic code wherein the first k nodes store the original
data. This is essential when serving read requests, since if the code is systematic, read requests can be served by simply 
reading the data in systematic nodes. Otherwise the system must perform a decoding operation to retrieve the original 
data for every read request. 


Finally, one of the most frequent operations performed in many distributed-storage systems is the encoding of new data 
entering the system. This encoding cost is a non-issue when using replication, but can be significant when using erasure 
codes. Thus it is desirable for the code to support fast encoding operations. For linear codes, encoding the original data 
can be represented as multiplication between a generator matrix and the data vector [10]. This encoding operation will
be fast if the generator matrix is sparse, since this reduces the number of computations performed. Informally, the 
sparsity of the generator matrix dictates how many data symbols need to be touched in order to generate each encoded 
symbol. 


This forms the motivation for this paper: to construct storage codes that satisfy all the above system-driven constraints. 
That is, storage codes having the following five properties: (i) Minimum storage for a targeted level of reliability (MDS), 
(ii) Minimal repair bandwidth, (iii) Flexible repair parameters: d < (n - 1), (iv) Systematic form of encoded data, and
(v) Fast encoding, enabled by a sparse generator matrix. 


There has been considerable interest in the recent past in constructing such erasure codes for distributed storage
[11]-[15]. However, to the best of our knowledge, all existing constructions in the literature address only a strict subset
of the above desired properties. This paper presents the first explicit codes which theoretically guarantee all five
desired properties simultaneously.


Our constructions are based on a powerful class of storage codes called Product-Matrix (PM) codes [16]. PM codes are
MDS, and hence are optimal w.r.t. storage overhead. They also consume optimal bandwidth during repair, since
they meet the lower bound presented in [17]. PM codes belong to a general class of codes known as Regenerating codes
[17], which meet this lower bound. Moreover, PM codes support a wide range of values for d: (2k - 2) ≤ d ≤ (n - 1).
Finally, the special structure of PM codes makes their generator matrix sparse, leading to fast encoding. Thus PM
codes satisfy Properties (i)-(iii) and (v).


Native PM codes, however, are not systematic. They can be converted to systematic form using a generic transformation
termed “systematic-remapping” [16]. However, this remapping does not respect the inherent structure of PM codes,
and thus often destroys their sparsity. Thus naively performing a remapping transform causes PM codes to be systematic
at the expense of fast encoding. For example, an [n, k, d = 2k - 1] PM code requires a block-length of B = k^2 symbols. That
is, each stored symbol can be, in general, a function of up to k^2 data symbols. However, due to the sparse structure of
native PM codes, each stored symbol is a function of only O(k) of these data symbols. This is no longer true after
systematic remapping, and in general each parity symbol becomes a (dense) function of on the order of k^2 symbols. This results in
significantly higher encoding time for systematic PM codes constructed in this manner.


In this paper, first, we present an analytical framework for studying and understanding the interaction between the 
design of PM codes and the systematic-remapping transformation. Using this, we provide an explicit construction of 
PM codes which remain sparse after systematic-remapping, for d = (2k - 2). In particular, each parity symbol in this
construction depends on only d = O(k) data symbols.


^ We use “PM codes” here to refer to the MDS version of Product-Matrix codes, termed PM-MSR in [16].

^ Note that for a systematic [n, k, d] MDS code, a sparsity of at least k symbols is necessary for each encoded symbol in the parity nodes.










Second, we consider the sparsity of codes supporting repair-by-transfer. A node assisting in a repair operation is said to
perform repair-by-transfer if it does not perform any computation, and merely transfers one of its stored symbols to
the failed node. Storage codes which support repair-by-transfer are appealing in practice, since they also minimize
the amount of data read during repairs. There have been a number of works in the recent past on constructing such
storage codes [20]-[25]. We show that a particular type of repair-by-transfer property leads to sparsity in any MDS
regenerating code. This provides a general way of constructing sparse MDS regenerating codes.

Third, using the above result, we construct explicit sparse systematic PM codes for all d ≥ (2k - 2). For example,
the generator matrix of a [n = 17, k = 8, d = 15] systematic-remapped PM code as in [16] is approximately 11% sparse, while our
construction is approximately 77% sparse.

We note that the construction provided in this paper is similar to the codes considered in [21], wherein the authors
present codes supporting repair-by-transfer for achieving savings in disk I/O. For d = (2k - 2), the construction provided
in the present paper is also similar to the recent construction in [19] by Le Scouarnec. In [19], the author presents a
sparse PM code and computationally validates its properties for a fixed range of k. In fact, the results presented in this
paper provide a theoretical proof of sparsity for the constructions in both of the above works [19], [21].


The remainder of this paper is organized as follows: Section II contains a review of Product-Matrix codes, systematic-remapping,
and other necessary background and notation. Section III contains a motivating example. Section IV
illustrates the main ideas of our approach to understanding sparsity, by showing that a simple form of PM encoding
matrix leads to partial sparsity. These techniques are extended in Section V to give an explicit construction of sparse
systematic PM codes for d = (2k - 2). In Section VI we consider more general regenerating codes, and show that
regenerating codes possessing a certain repair-by-transfer property are necessarily sparse. We apply this in Section VII
to construct explicit sparse systematic PM codes for d ≥ (2k - 2). Finally in Section VIII we show that the two
presented constructions of sparse PM codes for d = (2k - 2) are in fact equivalent in a certain sense.


II. Background 


A. Product-Matrix Codes 


Product-Matrix (PM) codes [16] are an explicit family of linear MDS codes which minimize bandwidth consumed in
repair, and exist for all $[n, k, d \ge 2k - 2]$.


Let the message to be stored consist of $B$ symbols from the finite field $\mathbb{F}_q$. An $[n, k, d](\alpha)$ PM code allows the message
to be stored across $n$ nodes, each storing $\alpha$ encoded symbols. All the $B$ symbols can be recovered from the data stored
in any $k$ of the total $n$ nodes. Further, any node's data may be exactly recovered by connecting to any $d$ other nodes,
and downloading one symbol from each. These $d$ nodes are known as “helper nodes.” The symbol transferred from a
helper node during node repair is a linear function of the data stored in it. PM codes are storage-optimal and
hence

$$B = k\alpha. \tag{1}$$

The parameter $\alpha$ is induced by $[n, k, d]$ as

$$\alpha = d - k + 1. \tag{2}$$
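As a quick check of equations (1) and (2), the following minimal Python sketch computes $\alpha$ and $B$ from $[n, k, d]$. The function name `pm_parameters` is hypothetical and introduced only for illustration; the range check assumes the parameter regime $2k - 2 \le d \le n - 1$.

```python
def pm_parameters(n, k, d):
    """Return (alpha, B) for an [n, k, d] PM code, per equations (1) and (2).

    Hypothetical helper for illustration; not part of any library.
    """
    if not (2 * k - 2 <= d <= n - 1):
        raise ValueError("expected 2k - 2 <= d <= n - 1")
    alpha = d - k + 1  # equation (2): symbols stored per node
    B = k * alpha      # equation (1): total number of message symbols
    return alpha, B


if __name__ == "__main__":
    print(pm_parameters(8, 4, 6))    # parameters of the example in Section III
    print(pm_parameters(17, 8, 15))  # parameters quoted in the Introduction
```

For $[n = 8, k = 4, d = 6]$ this gives $\alpha = 3$ and $B = 12$, matching the example of Section III. For $[17, 8, 15]$ it gives $B = 64$; if sparsity is read as the fraction of zero coefficients in a $d$-sparse parity row (an assumption about how the percentage is measured), then $1 - 15/64 \approx 77\%$, which is consistent with the figure quoted in the Introduction.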


We now describe the construction of PM codes. In general, a PM code is described by an $(n \times d)$ encoding matrix $\Psi$
and a $(d \times \alpha)$ message matrix $M$, yielding an $(n \times \alpha)$ code matrix $C$ defined by

$$C := \Psi M. \tag{3}$$

Let $c_i^T$ denote the $i$-th row of the code matrix $C$. Then the $i$-th node stores $c_i^T = \psi_i^T M$, where $\psi_i^T$ is the $i$-th row of $\Psi$.

Here we review PM codes for $d = (2k - 2)$, but the construction can be applied to $d > (2k - 2)$ by the shortening procedure,
which we review in Section VII-B.

For $d = (2k - 2)$, we have $\alpha = (d - k + 1) = (k - 1)$. For these parameters, the encoding matrix $\Psi$ is of the form:

$$\Psi = \begin{bmatrix} \Phi & \Lambda\Phi \end{bmatrix} \tag{4}$$

where $\Phi$ is an $(n \times \alpha)$ matrix and $\Lambda$ is an $(n \times n)$ diagonal matrix, with the following properties:

(1) Any $\alpha$ rows of $\Phi$ are linearly independent.

(2) Any $d$ rows of $\Psi$ are linearly independent.

(3) The diagonal elements of $\Lambda$ are all distinct.

These requirements can be met, for example, by choosing $\Psi$ to be a Vandermonde matrix with elements chosen carefully
to satisfy the third condition.

^ Constructions without puncturing were subsequently shown in [26] and [27].


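We will now specify the structure of the message matrix $M$. Recall that for $d = (2k - 2)$, we have $\alpha = (k - 1)$, $d = 2\alpha$, and
$B = k\alpha = \alpha(\alpha + 1)$. The $(d \times \alpha)$ message matrix $M$ is constructed as

$$M = \begin{bmatrix} S^a \\ S^b \end{bmatrix} \tag{5}$$

where $S^a$ and $S^b$ are $(\alpha \times \alpha)$ symmetric matrices. The matrices $S^a$ and $S^b$ together have precisely $\alpha(\alpha + 1)$ distinct
entries, which are populated by the $B = \alpha(\alpha + 1)$ message symbols.

Let $\psi_i^T$ denote the $i$-th row of $\Psi$, $\phi_i^T$ denote the $i$-th row of $\Phi$, and $\lambda_i$ denote the $i$-th diagonal element of $\Lambda$. Thus, under this encoding mechanism, node $i$ ($1 \le i \le n$)
stores the $\alpha$ symbols

$$c_i^T = \psi_i^T M = \phi_i^T S^a + \lambda_i \phi_i^T S^b. \tag{6}$$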


Under this encoding, the data in any k nodes suffice to reconstruct the B = ka message symbols. The original paper 
[16] presents an explicit reconstruction algorithm for general PM codes, relying on Properties 1 and 3 above. 

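PM codes allow repair of any failed node by downloading one symbol from each of any $d$ other helper nodes. For repairing
node $f$, helper node $i$ sends the single symbol

$$c_i^T \phi_f = \psi_i^T M \phi_f. \tag{7}$$

Upon receiving $d$ such helper symbols, failed node $f$ will have $\Psi_{\mathrm{rep}} M \phi_f$, where $\Psi_{\mathrm{rep}}$ consists of some $d$ rows of $\Psi$. It can then
invert $\Psi_{\mathrm{rep}}$ (by Property 2) to compute

$$M \phi_f = \begin{bmatrix} S^a \phi_f \\ S^b \phi_f \end{bmatrix}, \tag{8}$$

and thus can recover its data as

$$c_f^T = \phi_f^T S^a + \lambda_f \phi_f^T S^b \tag{9}$$

(this follows by the symmetry of the matrices $S^a$ and $S^b$).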
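The repair procedure of equations (7)-(9) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of [16]: the field $\mathbb{F}_{11}$, the parameters $[n = 8, k = 4, d = 6]$, the Vandermonde choice of $\Psi$ with rows $(x, x^2, \ldots, x^6)$, and all function names are chosen here for illustration only.

```python
import random

P = 11                      # prime field size (assumed for this sketch)
N, K, D = 8, 4, 6
ALPHA = D - K + 1

# Assumed Vandermonde encoding matrix: the row for x is (x, x^2, ..., x^d) mod P,
# so Phi is its first ALPHA columns and lambda_x = x^ALPHA.
PSI = [[pow(x, e, P) for e in range(1, D + 1)] for x in range(1, N + 1)]
PHI = [row[:ALPHA] for row in PSI]
LAM = [pow(x, ALPHA, P) for x in range(1, N + 1)]


def mat_mul(A, B):
    """Matrix product over F_P."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]


def mat_inv(A):
    """Invert a square matrix over F_P by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular.
    """
    n = len(A)
    aug = [[x % P for x in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = pow(aug[col][col], -1, P)
        aug[col] = [x * scale % P for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def random_symmetric(rng):
    """Random (ALPHA x ALPHA) symmetric matrix over F_P."""
    S = [[0] * ALPHA for _ in range(ALPHA)]
    for i in range(ALPHA):
        for j in range(i, ALPHA):
            S[i][j] = S[j][i] = rng.randrange(P)
    return S


def repair(C, f, helpers):
    """Recover row f of C from one symbol sent by each of the d helper nodes."""
    # Helper i sends the single symbol c_i^T phi_f, equation (7).
    received = [[sum(c * phi for c, phi in zip(C[i], PHI[f])) % P] for i in helpers]
    psi_rep = [PSI[i] for i in helpers]
    # Invert the (d x d) helper submatrix to obtain M phi_f, equation (8).
    m_phi = [row[0] for row in mat_mul(mat_inv(psi_rep), received)]
    # Equation (9), using the symmetry of S^a and S^b.
    return [(m_phi[j] + LAM[f] * m_phi[ALPHA + j]) % P for j in range(ALPHA)]


rng = random.Random(0)
M = random_symmetric(rng) + random_symmetric(rng)   # M = [S^a; S^b], equation (5)
C = mat_mul(PSI, M)                                  # code matrix, equation (3)
recovered = repair(C, f=0, helpers=[1, 2, 3, 4, 5, 6])
print(recovered == C[0])
```

Note that each helper transfers a single field symbol, so the failed node downloads only $d$ symbols in total, while the node itself stores $\alpha$ symbols.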


B. Systematic Codes and Remapping 


It is often desirable to have the B original message symbols included in the encoded symbols (in uncoded form). Such 
codes are called systematic codes. Throughout the paper we consider systematic codes in which the first k nodes store 
the uncoded symbols. These nodes are thus referred to as “systematic nodes.” 


Any linear MDS erasure code can be generically transformed into a systematic code, as follows. First, any linear code
taking $B$ message symbols to $n\alpha$ encoded symbols can be represented by an $(n\alpha \times B)$ generator matrix $G$, such that
for a message vector $m$ of length $B$, the encoded $n\alpha$ symbols are given by $Gm$.


A code can be made systematic through a “systematic remapping”: Let $G_k$ be the $(B \times B)$ matrix consisting of the first
$B$ rows of the original generator matrix $G$. To encode message $m$, first “remap” the message vector to $\tilde{m} := G_k^{-1} m$,
then encode as $G\tilde{m}$. Consider the resulting first $B$ encoded symbols: the message $m$ is first transformed by $G_k^{-1}$, then
transformed by $G_k$ during encoding. Therefore the first $B$ encoded symbols are exactly the message symbols $m$, making
the code systematic. Notice that the entire encoding operation is now equivalent to encoding the original message $m$
with generator matrix $G_{sys} := G G_k^{-1}$, which will have the first $(B \times B)$ block as identity by construction.

Observe that the systematic remapping operation applies $G_k^{-1}$, and hence can be thought of as decoding the message
from the first $k$ nodes under the original encoding with generator matrix $G$.
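The remapping can be illustrated on a toy linear code. The sketch below is an illustration under stated assumptions: the field $\mathbb{F}_5$ and the small $(4 \times 2)$ generator matrix are chosen only to keep the arithmetic readable and do not form a PM code, and the function names are hypothetical.

```python
P = 5  # small prime field, chosen only for illustration


def mat_mul(A, B):
    """Matrix product over F_P."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]


def mat_inv(A):
    """Invert a square matrix over F_P by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular.
    """
    n = len(A)
    aug = [[x % P for x in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = pow(aug[col][col], -1, P)
        aug[col] = [x * scale % P for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def systematic_remap(G, B):
    """Return G_sys = G * G_k^{-1}, where G_k is the first B rows of G."""
    return mat_mul(G, mat_inv(G[:B]))


# Toy generator matrix: 4 encoded symbols, B = 2 message symbols (not a PM code).
G = [[1, 1], [1, 2], [1, 3], [1, 4]]
G_sys = systematic_remap(G, 2)
print(G_sys)  # -> [[1, 0], [0, 1], [4, 2], [3, 3]]
```

The first $B$ rows of `G_sys` form the identity, so the first $B$ encoded symbols are the message itself; the remaining rows are the parity symbols, whose density depends on how $G_k^{-1}$ interacts with the structure of $G$.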


The above transform can be applied to the vanilla PM codes discussed in Section II-A and [16], to yield systematic
PM codes. However, as shown in the examples below, applying systematic remapping to traditional PM codes often
destroys their sparsity, leading to increased computational complexity.








C. Notation 


We will use the concept of an inclusion map. In general, an inclusion map is a map which injectively embeds one space 
into another space, by simply changing representation (not performing any non-trivial transformation). For example, 
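the following is an inclusion map from vectors of length 3 to symmetric $(2 \times 2)$ matrices:

$$\begin{bmatrix} a \\ b \\ c \end{bmatrix} \hookrightarrow \begin{bmatrix} a & b \\ b & c \end{bmatrix}$$

Inclusion maps will be denoted by hooked arrows ($\hookrightarrow$) as above.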
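The example inclusion map can be written as a short Python sketch. The function name `vector_to_symmetric` and the row-by-row ordering of the upper triangle are assumptions made here for illustration; the point is only that entries are rearranged, never combined.

```python
def vector_to_symmetric(v):
    """Embed a vector of length a(a+1)/2 into an (a x a) symmetric matrix.

    Entries fill the upper triangle row by row and are mirrored below the
    diagonal; no arithmetic is performed on the entries.
    """
    a = 0
    while a * (a + 1) // 2 < len(v):
        a += 1
    if a * (a + 1) // 2 != len(v):
        raise ValueError("length must be a triangular number")
    S = [[None] * a for _ in range(a)]
    entries = iter(v)
    for i in range(a):
        for j in range(i, a):
            S[i][j] = S[j][i] = next(entries)
    return S


print(vector_to_symmetric(["a", "b", "c"]))  # -> [['a', 'b'], ['b', 'c']]
```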

For notational simplicity, we will often abuse notation by using the same symbols to denote a space as well as a vector
in the space. For example, the systematic-remapping transformation of a message vector $m$, as in Section II-B, will be
written as a function $f : m \to \tilde{m}$.

The $(i, j)$-th entry of a matrix $M$ is denoted $M_{i,j}$. All vectors are column-vectors unless otherwise noted, and $^T$ denotes
transpose throughout.


III. Motivating Example 


A. Example 

To better understand the issues of sparsity and systematic remapping in PM codes, let us consider a particular
$[n = 8, k = 4, d = 6]$ PM code. For these parameters, each node stores $\alpha = 3$ symbols, and the number of message
symbols is $B = 12$. Let $\{m_0, \ldots, m_{11}\}$ denote these message symbols. Let us work in the field $\mathbb{F}_{11}$, with the encoding
matrix $\Psi$ and message matrix $M$ structured as described in Section II-A.

Recall from Section II-A that node $i$ stores the $i$-th row of $\Psi$ times $M$, so the entire code is $C = \Psi M$.


As in Section II-B, we can generically represent the encoding operation as an $(n\alpha \times B) = (24 \times 12)$ generator matrix $G$
times the message vector $m$, with entries $m_i$. That is, we can “unwrap” the matrix-matrix multiplication $C = \Psi M$
into each of $n\alpha = 24$ encoded symbols. For example, the first $\alpha = 3$ rows of $G$ correspond to the 3 linear combinations
stored by the first node:

$$\begin{bmatrix} 1&1&1&0&0&0&1&1&1&0&0&0 \\ 0&1&0&1&1&0&0&1&0&1&1&0 \\ 0&0&1&0&1&1&0&0&1&0&1&1 \end{bmatrix}$$

And the next 3 rows of $G$ correspond to the 3 linear combinations stored by the second node:

$$\begin{bmatrix} 2&4&8&0&0&0&5&10&9&0&0&0 \\ 0&2&0&4&8&0&0&5&0&10&9&0 \\ 0&0&2&0&4&8&0&0&5&0&10&9 \end{bmatrix}$$

Notice that the submatrix of the generator matrix corresponding to each node is $d$-sparse, with the same sparsity
pattern; the same holds for all $n = 8$ nodes of the full generator matrix.

^ $\mathbb{F}_{11}$ is the smallest prime field which will allow the PM construction of [16] for this parameter regime.

This code is not systematic, since it does not contain the uncoded message symbols. To make it systematic, we perform
the systematic remapping of Section II-B. Let $G_k$ be the $(B \times B)$ matrix consisting of the first $B$ rows of the generator
matrix $G$, that is, the rows of the first $k = 4$ nodes. The systematic generator matrix is $G_{sys} = G G_k^{-1}$.
Notably, in $G_{sys}$ the parity nodes are now almost entirely dense.


B. Discussion 


As seen here, traditional PM codes begin sparse, but become dense after systematic-remapping. We may expect this,
since the initial sparsity of PM codes comes from their product-matrix structure, but the systematic-remapping operates
generically on linear codes, not necessarily respecting the product-matrix structure. To address this, we need to understand
the effect of systematic remapping on Product-Matrix codes.

Traditionally, remapping is viewed as just decoding from the first k nodes, as discussed in Section II-B. In this case,
understanding decoding is sufficient to understand systematic remapping. This is well-suited for classical codes, where
the message-space and the code-space have the same structure. However, this is not true for Product-Matrix codes.


In Product-Matrix codes, encoding takes a (structured) message matrix $M$ to a code matrix $C$. In the example code,
decoding the $B = 12$ message symbols from the first $k = 4$ nodes is the inverse map $C \to M$, whose explicit
structure follows from the decoding algorithm in [16]. However, understanding the explicit structure of the decoding
map does not immediately aid in understanding systematic-remapping. This is because remapping is most naturally
viewed as a transformation between message matrices, $M \to \tilde{M}$.

^ In general, the parity nodes of the remapped code will be entirely dense; the small sparsities here are incidental, due to the small field size.

We address the above challenge by presenting a framework for understanding systematic remapping for product-matrix 
codes, and we further use this to construct PM codes which remain sparse after systematic remapping. An example of 
this construction is provided below. 


C. Sparse, Systematic PM Code 


In Sections V and VII, we present explicit constructions of sparse systematic PM codes. Here we show one such
construction instantiated for the same parameters as the example of Section III-A: $[n = 8, k = 4, d = 6]$.

The encoding matrix $\Psi'$ is chosen as:

$$\Psi' = \begin{bmatrix} 1&0&0&1&0&0 \\ 0&1&0&0&8&0 \\ 0&0&1&0&0&5 \\ 4&5&4&3&1&3 \\ 4&2&10&5&8&7 \\ 3&10&9&10&4&8 \\ 4&4&2&8&8&4 \\ 10&3&1&5&7&6 \end{bmatrix}$$


This yields a (non-systematic) generator matrix $G$ and, after systematic-remapping, the final generator matrix $G_{sys}$ of equation (12).

Notice that in this case (in contrast to the remapped code of Section III-A) the sparsity is not lost in systematic-remapping: each row of $G_{sys}$ is still
$d = 6$-sparse.
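The sparsity claim for this example can be checked numerically. The sketch below is an illustration under stated assumptions: the encoding matrix is entered as read from the example (its first row is taken to be $[1\ 0\ 0\ 1\ 0\ 0]$), the message symbols are indexed along the upper triangles of $S^a$ and $S^b$, the remapping is the generic $G_{sys} = G G_k^{-1}$ of Section II-B, and all function names are hypothetical.

```python
P = 11
ALPHA, B = 3, 12

# Encoding matrix of the example, as read from the text (first row assumed).
PSI_SPARSE = [
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 8, 0],
    [0, 0, 1, 0, 0, 5],
    [4, 5, 4, 3, 1, 3],
    [4, 2, 10, 5, 8, 7],
    [3, 10, 9, 10, 4, 8],
    [4, 4, 2, 8, 8, 4],
    [10, 3, 1, 5, 7, 6],
]


def message_index_matrix(alpha):
    """Indices of the message symbols inside M = [S^a; S^b] (both symmetric)."""
    def symmetric(offset):
        S = [[0] * alpha for _ in range(alpha)]
        idx = offset
        for i in range(alpha):
            for j in range(i, alpha):
                S[i][j] = S[j][i] = idx
                idx += 1
        return S
    return symmetric(0) + symmetric(alpha * (alpha + 1) // 2)


def generator_matrix(psi):
    """Unwrap C = Psi * M into one generator row per stored symbol."""
    m_idx = message_index_matrix(ALPHA)
    G = []
    for psi_row in psi:
        for j in range(ALPHA):
            row = [0] * B
            for l, coeff in enumerate(psi_row):
                row[m_idx[l][j]] = (row[m_idx[l][j]] + coeff) % P
            G.append(row)
    return G


def mat_mul(A, Bm):
    """Matrix product over F_P."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*Bm)] for row in A]


def mat_inv(A):
    """Gauss-Jordan inverse over F_P; raises StopIteration if singular."""
    n = len(A)
    aug = [[x % P for x in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = pow(aug[col][col], -1, P)
        aug[col] = [x * scale % P for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % P for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


G = generator_matrix(PSI_SPARSE)
G_sys = mat_mul(G, mat_inv(G[:B]))                  # systematic remapping
weights = [sum(1 for x in row if x) for row in G_sys]
print(max(weights))                                  # largest number of nonzeros in any row
```

Because the generic remapping differs from any other systematic remapping only by a permutation of the message symbols, the row weights computed this way do not depend on the indexing convention chosen for $M$.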

















IV. First Step towards Sparsity in PM Codes 


In this section we analyze a simple family of encoding matrices $\Psi$ which, after systematic remapping, results in codes
with partial sparsity. The tools developed here will be useful in subsequent sections.

Recall the structure of the encoding matrix for $d = (2k - 2)$ PM codes from (4):

$$\Psi = \begin{bmatrix} \Phi & \Lambda\Phi \end{bmatrix}.$$

Consider a $d = (2k - 2)$ PM code in which the first row of $\Phi$ is $e_1^T = [1 \; 0 \; \cdots \; 0]$. Then the encoding matrix is of
the form

$$\Psi = \begin{bmatrix} e_1^T & \lambda_1 e_1^T \\ \Phi' & \Lambda'\Phi' \end{bmatrix} \tag{13}$$

where $\Phi'$ and $\Lambda'$ denote the parts of $\Phi$ and $\Lambda$ corresponding to the remaining $(n - 1)$ nodes.

We will now show that under such PM codes, the first symbol stored in every node is $d$-sparse after systematic remapping.
Let $\Psi_k$ denote the first $k$ rows of $\Psi$, that is, the encoding submatrix for the first $k$ nodes. Then the first $k$ nodes store

$$C_k = \Psi_k M. \tag{14}$$


Let $f_e : M \to C_k$ denote the above encoding function for the first $k$ nodes. We represent systematic remapping as a linear
transformation $f_s : M \to \tilde{M}$ between the original message matrix $M$ and the resultant message matrix $\tilde{M}$ after transformation.
After the remapping, the first $k$ nodes become systematic (see Section II-B). That is, the transformation $f_s$ is
such that if we encode the first $k$ nodes using message matrix $\tilde{M}$, we recover the original symbols of $M$ in matrix $C_k$.
Equivalently, for a systematic code, the entire encoding transform

$$M \xrightarrow{f_s} \tilde{M} \xrightarrow{f_e} C_k \tag{15}$$

must act as an inclusion map $M \hookrightarrow C_k$. This inclusion map “unwraps” the symmetric matrices in $M$ into one matrix
$C_k$ with distinct message symbols.


To understand the interaction between the PM code and systematic remapping, we will define an explicit inclusion
map $f_t$, and decompose the remapping $f_s$ into two stages. We first represent the matrix $M$ as the matrix $C_k$ using the
inclusion map $f_t$, and then “decode” $C_k$ into $\tilde{M}$ using the decoding function $f_e^{-1}$. Note that $f_e$ is invertible since it is an
MDS encoding, wherein all message symbols can be decoded from any $k$ nodes. The remapping transform thus becomes

$$f_s = f_e^{-1} \circ f_t. \tag{16}$$

In other words,

$$f_s : M \xrightarrow{f_t} C_k \xrightarrow{f_e^{-1}} \tilde{M}. \tag{17}$$

Thus the entire encoding transformation for the first $k$ nodes becomes

$$M \xrightarrow{f_t} C_k \xrightarrow{f_e^{-1}} \tilde{M} \xrightarrow{f_e} C_k. \tag{18}$$

Notice that this makes the entire encoding transform $M \to C_k$ an inclusion map (equal to $f_t$, in fact), thus resulting in
a systematic code as desired.


Remark 1. Any choice of inclusion map in (16) will yield a systematic remapping. However, as we will see, our particular
choice of $f_t$ will be convenient for proving sparsity results.


At a high level, the key ideas behind our approach for showing sparsity are as follows.

(1) For our choices of $\Psi$ and $f_t$, the systematic remapping $f_s : M \to \tilde{M}$ is such that the first column of $\tilde{M}$ depends
only on the first column of $M$ (Lemma 2).

(2) The first stored symbol in node $i$ is the $i$-th row of $\Psi$ times the first column of $\tilde{M}$. This depends only on the first
column of $\tilde{M}$, and therefore (through $f_s$) depends only on the first column of $M$.

And Lemma 2 holds because:

(1) The “decoding” $f_e^{-1} : C_k \to \tilde{M}$ is such that the first column of $\tilde{M}$ depends only on the first row and first column
of $C_k$ (Lemma 1).

(2) Our inclusion map $f_t : M \hookrightarrow C_k$ will be such that the symbols in the first row/column of $C_k$ correspond exactly to
the first column of $M$.

The sparsity pattern of systematic remapping (Lemma 2) is visualized below:

[Figure: sparsity pattern of $f_s : M \to \tilde{M}$.]

^ Here we assume that such codes exist, and analyze their properties. Explicit constructions of such codes are presented in Section V-B.
We now consider each component of the systematic remapping transformation in detail, and then prove the sparsity of 
the entire encoding. 


A. The Triangular Inclusion Map 


Here we will define the inclusion map $f_t$, termed the “triangular inclusion map.” Recall from Section II-A that the
message matrix in $d = (2k - 2)$ PM codes is of the form $M = \begin{bmatrix} S^a \\ S^b \end{bmatrix}$, where $S^a$ and $S^b$ are symmetric matrices of
message symbols.

To map $M \hookrightarrow C_k$ by inclusion, place the upper-triangular half of $S^a$ on the upper-triangular half of $C_k$, including the
diagonal. Then place the lower-triangular half of $S^b$ on the lower-triangular half of $C_k$, excluding the diagonal. Finally,
place the diagonal of $S^b$ on the last row of $C_k$. For example, consider a PM code with $k = 4$, $d = 6$, for which $\alpha = 3$
and the number of message symbols is $B = 12$. The triangular inclusion map in this case is:

$$M = \begin{bmatrix} 0&1&2 \\ 1&3&4 \\ 2&4&5 \\ 6&7&8 \\ 7&9&10 \\ 8&10&11 \end{bmatrix} \hookrightarrow C_k = \begin{bmatrix} 0&1&2 \\ 7&3&4 \\ 8&10&5 \\ 6&9&11 \end{bmatrix}$$

In the above matrices, numbers refer to symbol indices. Notice that symbols in column $i$ ($0 \le i \le 2$) of $M$ correspond
exactly to symbols in row $i$ and column $i$ of $C_k$.
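The triangular inclusion map can be sketched in Python as follows. This is a minimal illustration; the function name `triangular_inclusion` is hypothetical, and the symbol indices are those of the $k = 4$, $d = 6$ example above.

```python
def triangular_inclusion(Sa, Sb):
    """Map M = [S^a; S^b] to the ((alpha + 1) x alpha) matrix C_k by inclusion."""
    alpha = len(Sa)
    C = [[None] * alpha for _ in range(alpha + 1)]
    for i in range(alpha):
        for j in range(alpha):
            if j >= i:
                C[i][j] = Sa[i][j]   # upper triangle of S^a, including the diagonal
            else:
                C[i][j] = Sb[i][j]   # strictly lower triangle of S^b
    C[alpha] = [Sb[i][i] for i in range(alpha)]  # diagonal of S^b on the last row
    return C


# Symbol indices from the example with k = 4, d = 6.
Sa = [[0, 1, 2], [1, 3, 4], [2, 4, 5]]
Sb = [[6, 7, 8], [7, 9, 10], [8, 10, 11]]
C_k = triangular_inclusion(Sa, Sb)
print(C_k)  # -> [[0, 1, 2], [7, 3, 4], [8, 10, 5], [6, 9, 11]]

# Column i of M holds the same symbols as row i together with column i of C_k.
M = Sa + Sb
for i in range(3):
    column_of_M = {row[i] for row in M}
    row_and_column_of_C = set(C_k[i]) | {row[i] for row in C_k}
    assert column_of_M == row_and_column_of_C
```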

B. The Inverse Map

Here we will consider the structure of the inverse map $f_e^{-1} : C_k \to \tilde{M}$, and show that it has a particular sparsity pattern.

Lemma 1. In the inverse transform $f_e^{-1} : C_k \to \tilde{M}$, the first column of $\tilde{M}$ depends only on the first row and first
column of $C_k$.

Proof: Let $\psi_1^T$ denote the first row of $\Psi$. In the encoding transform $f_e : \tilde{M} \to C_k$, notice that the first row and first
column of $C_k$ depend only on the first column of $\tilde{M}$:

• The first row of $C_k$ is $\psi_1^T$ times $\tilde{M}$. Since $\psi_1^T = [e_1^T \;\; \lambda_1 e_1^T]$, this only involves symbols in the first row of $\tilde{S}^a$ and the first
row of $\tilde{S}^b$ (the symmetric blocks of $\tilde{M}$), or equivalently, by symmetry, the first column of $\tilde{M}$.

• The first column of $C_k = \Psi_k \tilde{M}$ clearly depends only on the first column of $\tilde{M}$.

Therefore, we can consider the restriction of the map $f_e : \tilde{M} \to C_k$ to symbols in the first column of $\tilde{M}$, and the first
row/column of $C_k$. There are $d$ symbols in both domain and co-domain. Further, this map is full-rank by construction,
since it is a restriction of MDS encoding. Therefore this map is invertible, and the first column of $\tilde{M}$ can be recovered
from the first row/column of $C_k$. □


C. Sparsity 


Here we combine the above maps and show that the entire encoding transform has a certain sparsity. The following 
lemma serves as our main tool. 

Lemma 2. In the systematic-remapping transform $f_s : M \to \tilde{M}$, a symbol in the first column of $\tilde{M}$ only depends on
symbols in the first column of $M$.

Proof: From (17), we write the remapping transform as $M \xrightarrow{f_t} C_k \xrightarrow{f_e^{-1}} \tilde{M}$. From Lemma 1, the first column of $\tilde{M}$
depends only on the first row/column of $C_k$. And by the triangular inclusion map $f_t$ as defined in Section IV-A, the
first row/column of $C_k$ corresponds to the first column of $M$. □


We can then show partial sparsity of the entire encoding:

Theorem 1. Consider a $d = (2k - 2)$ PM code in which the first row of $\Phi$ is $e_1^T = [1 \; 0 \; \cdots \; 0]$. When this code is
made systematic, the first symbol stored in every node is $d$-sparse.

Proof: The first symbol of each node is a row of $\Psi$ times the first column of $\tilde{M}$. But the first column of $\tilde{M}$ depends
only on the first column of $M$ (by Lemma 2), which contains $d = 2\alpha$ symbols, so the first symbol of each node is $d$-sparse w.r.t. symbols in $M$. Essentially,
the sparsity occurs because the sparsity patterns of the following two transformations, restricted to the first column of
$M$, are aligned:

$$M \xrightarrow{f_s} \tilde{M} \xrightarrow{\Psi} \Psi\tilde{M}. \tag{19}$$

□

Remark 2. An analogous argument shows that if one of the first $k$ rows of $\Phi$ is $e_i^T$, then the $i$-th symbol of every node is
$d$-sparse.

Remark 3. It may seem that the above sparsity argument only works with our particular inclusion map $f_t$, but in fact
it applies to any systematic remapping. Notice that the systematic remapping function is unique up to permutation of
the $k\alpha$ message symbols. Therefore, if some final encoded symbol is a function of $d$ original message symbols for a
particular systematic-remapping function, it will remain a function of some $d$ (permuted) message symbols in any other
systematic-remapping.


V. Explicit Sparse, Systematic PM Codes for d = (2k - 2)

In this section, we first consider a particular design of encoding matrices $\Psi$, and prove that, after systematic remapping,
they yield PM codes in which each encoded symbol is $d$-sparse. We then present explicit constructions of such matrices.
Our analysis builds on the techniques presented in the previous section.

In this section, for simplicity of notation, we will write the matrix $C_k$ as simply $C$, so $C_{i,j}$ denotes the $(i, j)$-th entry of
$C_k$.


A. Design of the Encoding Matrix and Sparsity


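Consider a $d = (2k - 2)$ PM code in which the first $\alpha$ rows of $\Phi$ form an identity matrix. In this case, the encoding
matrix for the first $k$ nodes is of the form:

$$\Psi_k = \begin{bmatrix} I & \Lambda \\ r^T & \lambda r^T \end{bmatrix} \tag{20}$$

where $\Lambda$ is the $(\alpha \times \alpha)$ diagonal matrix of the first $\alpha$ nodes, $\lambda$ is the corresponding scalar for node $k$, and $r$ is an $\alpha$-length vector. We will show that under such an encoding matrix, after systematic remapping, every encoded
symbol is $d$-sparse.

From the properties of PM encoding matrices discussed in Section II-A we have:

• Property 1: The diagonal entries of $\Lambda$ together with $\lambda$ are all distinct.

• Property 2: All sub-matrices of $\begin{bmatrix} I \\ r^T \end{bmatrix}$ are full-rank. In particular, all entries of $r$ are nonzero.

The corresponding encoding transform for the first $k$ nodes is:

$$\begin{aligned}
C &= \Psi_k M && (21) \\
  &= \begin{bmatrix} I & \Lambda \\ r^T & \lambda r^T \end{bmatrix} \begin{bmatrix} S^a \\ S^b \end{bmatrix} && (22) \\
  &= \begin{bmatrix} S^a + \Lambda S^b \\ r^T S^a + \lambda r^T S^b \end{bmatrix} && (23) \\
  &=: \begin{bmatrix} C_1 \\ c_2^T \end{bmatrix} && (24)
\end{aligned}$$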


1) The Inverse Map: Here we describe how to recover the message matrices 5“ and S’’ from Ck, thus specifying the 
explicit structure of the inverse map /~^. 

First, all the non-diagonal entries of S’’, S’’ can be found by solving: 


r' . — 1 ?“. . j- \ .c:b _ . 

^l,J — 

C — 1?“ 4- \ 


(25) 


(Since Xi yf Xj by Property 1). 

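For the diagonal entries, we first compute $S^a r$ and $S^b r$ as follows.

First define the following two vectors, which can be computed directly from $C$:

\begin{align}
c_1 &:= C_1 r = S^a r + \Lambda S^b r \tag{26}\\
c_2 &:= C_2^T = S^a r + \lambda S^b r \tag{27}
\end{align}

Then $S^a r$ and $S^b r$ can be computed from:

\begin{align}
S^a r &= (\Lambda - \lambda I)^{-1}(\Lambda c_2 - \lambda c_1) \tag{28}\\
S^b r &= (\Lambda - \lambda I)^{-1}(c_1 - c_2) \tag{29}
\end{align}

where the diagonal matrix $(\Lambda - \lambda I)$ is invertible by Property 1.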

Now we compute the $i$-th diagonal entry of $S^a$ from $S^a r$. Let $S^a_i$ denote row $i$ of $S^a$. After computing $S^a r$ as above, we can extract $S^a_i r = \sum_j S^a_{i,j} r_j$. Then $S^a_{i,i}$ can be computed as:

$$S^a_{i,i} = \Big(S^a_i r - \sum_{j \neq i} S^a_{i,j} r_j\Big) / r_i \tag{30}$$

Notice that the non-diagonal elements $S^a_{i,j}$ are known, and $r_i \neq 0$ by Property 2. The diagonal elements of $S^b$ can be recovered similarly from $S^b r$.
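The inverse map can be sketched numerically. The following Python (3.8+, for `pow(x, -1, P)`) is a minimal illustration under assumptions that are not in the paper: arithmetic is over the small prime field GF(257), and the helper names `encode_first_k` and `invert_first_k` are hypothetical.

```python
P = 257  # assumed prime field size, chosen only for illustration


def inv(x):
    """Multiplicative inverse in GF(P)."""
    return pow(x % P, -1, P)


def encode_first_k(Sa, Sb, lam, lam_k, r):
    """Return (C1, C2) as in (23)-(24): C1 = S^a + Lambda S^b, C2 = r^T S^a + lam_k r^T S^b."""
    a = len(r)
    C1 = [[(Sa[i][j] + lam[i] * Sb[i][j]) % P for j in range(a)] for i in range(a)]
    C2 = [sum(r[i] * (Sa[i][j] + lam_k * Sb[i][j]) for i in range(a)) % P
          for j in range(a)]
    return C1, C2


def invert_first_k(C1, C2, lam, lam_k, r):
    """Recover the symmetric matrices S^a, S^b from (C1, C2), following (25)-(30)."""
    a = len(r)
    Sa = [[0] * a for _ in range(a)]
    Sb = [[0] * a for _ in range(a)]
    # Off-diagonal entries: solve the 2x2 system (25) for each pair i != j.
    for i in range(a):
        for j in range(a):
            if i != j:
                Sb[i][j] = (C1[i][j] - C1[j][i]) * inv(lam[i] - lam[j]) % P
                Sa[i][j] = (C1[i][j] - lam[i] * Sb[i][j]) % P
    # c1 = C1 r and c2 = C2^T, as in (26)-(27).
    c1 = [sum(C1[i][j] * r[j] for j in range(a)) % P for i in range(a)]
    c2 = C2
    for i in range(a):
        d = inv(lam[i] - lam_k)
        Sa_r_i = (lam[i] * c2[i] - lam_k * c1[i]) * d % P  # i-th entry of (28)
        Sb_r_i = (c1[i] - c2[i]) * d % P                   # i-th entry of (29)
        off_a = sum(Sa[i][j] * r[j] for j in range(a) if j != i)
        off_b = sum(Sb[i][j] * r[j] for j in range(a) if j != i)
        # Diagonal entries via (30); r[i] is nonzero by Property 2.
        Sa[i][i] = (Sa_r_i - off_a) * inv(r[i]) % P
        Sb[i][i] = (Sb_r_i - off_b) * inv(r[i]) % P
    return Sa, Sb
```

Note that the $i$-th diagonal entry is computed only from row $i$ of `C1`, column $i$ of `C1`, and entry $i$ of `C2`, which is the locality that the sparsity argument relies on.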

2) Sparsity: Using the structure of the inverse map described above, together with the triangular inclusion map defined in Section IV-A, we will show that the entire encoding transform is $d$-sparse.

Analogous to Lemma we first show that the systematic-remapping transform has a certain sparsity. 

Lemma 3. In the systematic-remapping transform $M \mapsto \tilde{M}$, the symbol $\tilde{M}_{i,j}$ only depends on symbols in column $j$ of $M$.


Proof: First notice that the sparsity pattern of $f^{-1}$ in recovering $S^a$ from $C$ is as follows:

• Non-diagonal element $S^a_{i,j}$ depends on elements $C_{i,j}$ and $C_{j,i}$, as in (25).

• Diagonal element $S^a_{i,i}$ depends on row $i$ and column $i$ of $C$. To see this, first compute all non-diagonal entries $S^a_{i,j}$ from (25) using row $i$ and column $i$ of $C_1$. Then compute the $i$-th component of $S^a r$ from (28), using the $i$-th entry of $c_1$ and $c_2$. Finally, compute $S^a_{i,i}$ from (30).

The same sparsity holds for recovering $S^b$ from $C$ as well.















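Now let $\tilde{M} = \begin{bmatrix} \tilde{S}^a \\ \tilde{S}^b \end{bmatrix}$. We will show that symbol $\tilde{S}^a_{i,j}$ depends only on column $j$ of $M$ (that is, on column $j$ of $S^a$ and of $S^b$); a symmetric argument holds for $\tilde{S}^b_{i,j}$. Writing the systematic-remapping as $M \mapsto C \xrightarrow{f^{-1}} \tilde{M}$, there are two cases:

• Non-diagonal element $\tilde{S}^a_{i,j}$ depends on $C_{i,j}$ and $C_{j,i}$ which, by our inclusion map, correspond to $S^a_{i,j}$ and $S^b_{i,j}$.

• Diagonal element $\tilde{S}^a_{j,j}$ depends on row $j$ and column $j$ of $C$, which correspond to column $j$ of $M$.

□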


This allows us to show sparsity of the entire encoding. 

Theorem 2. Consider a $d = (2k-2)$ PM code in which the first $\alpha$ rows of $\Phi$ form an identity matrix. When this code is made systematic, each encoded symbol is $d$-sparse.

Proof: Each encoded symbol is a row of $\Psi$ times a column of $\tilde{M}$, by the PM encoding. But each column of $\tilde{M}$ depends only on the corresponding column of $M$ (by Lemma 3), and a column of $M$ contains $2\alpha = d$ symbols. Thus we conclude that the final encoding $\Psi\tilde{M}$ is $d$-sparse with respect to the symbols of $M$, since the two maps have aligned sparsity patterns:

$$M \mapsto \tilde{M} \mapsto \Psi\tilde{M}. \tag{31}$$

□


B. Explicit Construction 


We now present explicit constructions of matrices $\Psi$ which conform to the design of Section V-A. This yields explicit systematic $d = (2k-2)$ PM codes in which each encoded symbol is $d$-sparse.

Theorem 3. Let $\Psi = [\Phi \;\; \Lambda\Phi]$ be the encoding matrix for a $d = (2k-2)$ PM code, satisfying the properties mentioned in Section II-A. For example, we can let $\Psi$ be a Vandermonde matrix. Let the $(\alpha \times \alpha)$ matrix $\Phi_\alpha$ denote the first $\alpha$ rows of $\Phi$. Then the following encoding matrix:

$$\Psi' = [\Phi\Phi_\alpha^{-1} \;\; \Lambda\Phi\Phi_\alpha^{-1}] := [\Phi' \;\; \Lambda\Phi'] \tag{32}$$

defines a $d = (2k-2)$ PM code in which, after systematic remapping, each encoded symbol is $d$-sparse.


Proof: The matrix $\Psi'$ satisfies the properties of Section II-A, since multiplication by the full-rank matrix $\Phi_\alpha^{-1}$ will not destroy the rank of any submatrices of the original encoding matrix $\Psi$. Therefore $\Psi'$ satisfies all properties of a PM encoding matrix. Further, the first $\alpha$ rows of $\Phi'$ are the identity. So by Theorem 2 we conclude that after systematic-remapping, this code will remain $d$-sparse. □


In other words, if we represent the encoding procedure for this systematic code as an $(n\alpha \times B)$ generator matrix $G$ mapping $B$ message symbols to $n\alpha$ encoded symbols ($\alpha$ per node), then each row of $G$ will be $d$-sparse.
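The transformation of the encoding matrix in (32) can be sketched as follows. This is a minimal Python (3.8+) illustration, not the authors' implementation. It assumes a prime field GF(257) and a Vandermonde choice in which row $i$ of $\Phi$ is $(1, x_i, \ldots, x_i^{\alpha-1})$ and $\lambda_i = x_i^{\alpha}$; the function names are hypothetical.

```python
P = 257  # assumed prime field size, for illustration only


def mat_mul(A, B):
    """Matrix product over GF(P)."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]


def mat_inv(A):
    """Gauss-Jordan inverse over GF(P); raises StopIteration if A is singular."""
    n = len(A)
    M = [[v % P for v in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c])
        M[c], M[piv] = M[piv], M[c]
        s = pow(M[c][c], -1, P)
        M[c] = [v * s % P for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c]
                M[r] = [(v - f * w) % P for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]


def vandermonde_pm(xs, alpha):
    """Assumed Vandermonde choice: Phi[i] = (1, x_i, ..., x_i^(alpha-1)), lambda_i = x_i^alpha."""
    Phi = [[pow(x, j, P) for j in range(alpha)] for x in xs]
    lam = [pow(x, alpha, P) for x in xs]
    return Phi, lam


def sparse_encoding_matrix(Phi, lam, alpha):
    """Return (Phi', Psi') with Phi' = Phi Phi_alpha^{-1} and Psi' = [Phi'  Lambda Phi'], as in (32)."""
    Phi_p = mat_mul(Phi, mat_inv(Phi[:alpha]))
    Psi_p = [row + [l * v % P for v in row] for row, l in zip(Phi_p, lam)]
    return Phi_p, Psi_p
```

For instance, `sparse_encoding_matrix(*vandermonde_pm([1, 2, 3, 4, 5], 2), 2)` yields a $\Phi'$ whose first two rows form the identity, which is the structure required by Theorem 2. The points `xs` must be chosen so that the resulting $\lambda_i$ are distinct.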


VI. Sparsity in Systematic MSR Codes from Repair-By-Transfer 

Sections IV and V dealt with constructing sparse systematic PM codes. In this section, we consider sparsity in more general systematic regenerating codes.


A. Background: MSR Codes and Repair-by-Transfer 


An $[n,k,d](\alpha,\beta)$ regenerating code allows the message to be stored across $n$ nodes, each storing $\alpha$ encoded symbols. All the $B$ symbols can be recovered from the data stored in any $k$ of the total $n$ nodes. Further, any node's data may be exactly recovered by connecting to any $d$ other nodes, and downloading $\beta \le \alpha$ symbols from each. The symbols transferred from a helper node during node repair may in general be some arbitrary function of the data stored in it.

Minimum-Storage-Regenerating (MSR) codes are regenerating codes which are also MDS, and therefore satisfy

$$B = k\alpha. \tag{33}$$

For example, an $[n,k,d]$ PM code is an $[n,k,d](\alpha = d-k+1, \beta = 1)$ MSR code. The seminal work by Dimakis et al. [17] shows that for MSR codes, the parameters above must necessarily satisfy

$$\alpha = \beta(d-k+1). \tag{34}$$
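As a quick sanity check of (33) and (34), the parameters can be computed directly. The sketch below is illustrative only: the function name is hypothetical, and the range check $k \le d \le n-1$ is the customary regenerating-code assumption rather than something stated in this section.

```python
def msr_parameters(n, k, d, beta=1):
    """Return (alpha, B) for an [n, k, d](alpha, beta) MSR code via (33) and (34)."""
    if not (1 <= k <= d <= n - 1):
        raise ValueError("expected 1 <= k <= d <= n - 1")
    alpha = beta * (d - k + 1)  # (34)
    B = k * alpha               # (33)
    return alpha, B
```

For example, a PM code with $k = 3$ and $d = 2k-2 = 4$ stores $\alpha = 2$ symbols per node and $B = 6$ message symbols in total.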











During a node-repair operation, a helper node is said to perform repair-by-transfer (RBT) if it does not perform any computation and merely transfers one of its $\alpha$ stored symbols to the failed node. We say a linear $[n,k,d](\alpha,\beta = 1)$ MSR code supports RBT with the RBT-SYS pattern if every node can help the first $\alpha$ nodes via RBT.


B. Sparsity from Repair-by-Transfer

We now present a general connection between sparsity and repair-by-transfer, by showing that an MSR code with a 
certain RBT property must necessarily be sparse. 

Let $\mathcal{C}$ be a linear systematic MSR $[n,k,d](\alpha,\beta = 1)$ code of blocklength $B = k\alpha$, with $(n\alpha \times B)$ generator matrix $G$. Let $G^{(i)}$ be the $(\alpha \times B)$ submatrix corresponding to the $i$-th node.

Theorem 4. If $\mathcal{C}$ supports repair of a systematic node $\nu$ via RBT with helper nodes comprising the remaining $(k-1)$ systematic nodes and $d-(k-1) = \alpha$ other parity nodes, then for each parity $i$, the corresponding generator-submatrix $G^{(i)}$ has one row with sparsity $\le d$.

In particular, the row of $G^{(i)}$ corresponding to the symbol transferred for the repair of node $\nu$ is supported on at most the following coordinates.

• The $\alpha$ coordinates corresponding to symbols stored by node $\nu$.

• For each of the other $(k-1)$ participating systematic nodes $\mu \neq \nu$: one coordinate corresponding to a symbol stored by node $\mu$.


Proof: Say systematic node 0 fails, and is repaired via RBT by the $(k-1)$ other systematic nodes, and $\alpha$ other parity nodes. Each helper will send one of its $\alpha$ stored symbols. For the systematic helpers, these symbols correspond directly to message symbols; let $S$ be the set of these message symbol indices. Notice that $S$ is disjoint from the symbols that node 0 stores. For the parity helpers, each transferred symbol is a linear combination of message symbols. We claim that these linear combinations cannot be supported on more than the message symbols that node 0 stores, and the set $S$. That is, in total the support size can be at most $\alpha + (k-1) = d$.

Intuitively, Theorem 4 holds because the symbols from systematic helpers can only "cancel interference" in $(k-1)$ coordinates (of $S$), and the $\alpha$ parity helpers must allow the repair of node 0's $\alpha$ coordinates, and thus cannot contain more interference. This concept of interference-alignment is made precise in [28], and our Theorem 4 follows as a corollary of "Property 2 (Necessity of Interference Alignment)" proved in Section VI.D of [28].

□
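The support pattern in Theorem 4 can be made concrete with a small sketch. The column layout below (message symbol $s$ of systematic node $m$ sits in column $m\alpha + s$) and the function name are assumptions made purely for illustration.

```python
def rbt_row_support(k, alpha, failed, transferred):
    """Columns on which a parity helper's transferred row may be nonzero (Theorem 4).

    Assumed layout: message symbol s of systematic node m is column m * alpha + s.
    transferred[m] is the index of the symbol that systematic helper m sends.
    """
    # All alpha coordinates of the failed systematic node.
    support = {failed * alpha + s for s in range(alpha)}
    # One coordinate per other systematic node: the symbol it transfers.
    for m in range(k):
        if m != failed:
            support.add(m * alpha + transferred[m])
    return support
```

With $k = 3$, $\alpha = 2$, failed node 0, and systematic helpers 1 and 2 transferring their symbols 0 and 1 respectively, the allowed support has $\alpha + (k-1) = 4 = d$ coordinates.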


Theorem 5. If $\mathcal{C}$ supports the RBT-SYS pattern, then for each parity $i$, the corresponding generator-submatrix $G^{(i)}$ has $\min(\alpha,k)$ rows that are $d$-sparse. In particular, if $d \le (2k-1)$, then all rows of $G$ are $d$-sparse.

Proof: In the RBT-SYS pattern, each parity node $i$ helps the first $\alpha$ nodes via RBT, including $\min(\alpha,k)$ systematic nodes. In each repair of a systematic node, the row of $G^{(i)}$ corresponding to the RBT symbol sent is $d$-sparse (by Theorem 4). This is true for each of the symbols sent to systematic nodes. These transferred symbols correspond to distinct symbols stored in node $i$, by Property 3, Section 6 of [28], which states that these symbols must be linearly independent. Therefore, $\min(\alpha,k)$ rows of $G^{(i)}$ are $d$-sparse.

In particular, for an MSR code, $d \le (2k-1)$ implies $\alpha \le k$, so all rows of $G$ are $d$-sparse in this regime. □


VII. Explicit Sparse, Systematic PM Codes for $d > (2k-2)$


Section VI provides a strong connection between repair-by-transfer and sparsity in systematic MSR codes. This connection allows us to construct explicit sparse PM codes for $d > (2k-2)$. First we review how to construct systematic PM codes which support the RBT-SYS pattern, from [21]. We then review the notion of code shortening for PM codes, from [16]. We apply these tools with the results of Section VI to present explicit systematic PM codes in which all encoded symbols are $d$-sparse.


A. Repair-By-Transfer (RBT) for PM Codes 

Recall that in a code that supports the RBT-SYS pattern, if any of the first $\alpha$ nodes fail, every remaining node can help it by simply transferring one of its stored symbols.

For any $d > (2k-2)$, let $C = \Psi M$ be the code matrix of a PM code $\mathcal{C}$. Recall that node $i$ stores a row $c_i^T = \psi_i^T M$. To help repair node $f$, node $i$ sends $c_i^T \mu_f$, for some repair vector $\mu_f$. In helping the first $\alpha$ nodes, node $i$ would thus send the $\alpha$ symbols $c_i^T P$ where

$$P = [\mu_1 \;\cdots\; \mu_\alpha]. \tag{35}$$

Define the RBT-transformed code $\mathcal{C}'$ as the code $\mathcal{C}$ where the data in each node is transformed by $P$: node $i$ now stores $c_i^T P$. Hence the encoding procedure for $\mathcal{C}'$ results in the code matrix

$$C' = CP = \Psi M P. \tag{36}$$

Notice that if $P$ is invertible, then $\mathcal{C}'$ shares the same MDS and repair properties as $\mathcal{C}$. Additionally, in $\mathcal{C}'$, node $i$ can help repair any of the first $\alpha$ nodes (say, node $j$) by simply transferring its $j$-th symbol: $c_i^T \mu_j$.

For $d = (2k-2)$ PM codes, as reviewed earlier, the matrix $P = \Phi_\alpha^T$, which is invertible by construction.
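The RBT transformation for $d = (2k-2)$ can be simulated end to end. The Python (3.8+) sketch below is a minimal illustration under several assumptions not taken from this section: a prime field GF(257), a Vandermonde $\Psi$ with $\lambda_i = x_i^{\alpha}$, the repair vectors $\mu_f = \phi_f$ (column $f$ of $\Phi_\alpha^T$), and the usual PM decoding step in which the replacement node inverts the $d$ helper rows of $\Psi$. All function names are hypothetical.

```python
P = 257  # assumed prime field size, for illustration only


def mat_mul(A, B):
    """Matrix product over GF(P)."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]


def mat_inv(A):
    """Gauss-Jordan inverse over GF(P); raises StopIteration if A is singular."""
    n = len(A)
    M = [[v % P for v in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c])
        M[c], M[piv] = M[piv], M[c]
        s = pow(M[c][c], -1, P)
        M[c] = [v * s % P for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c]
                M[r] = [(v - f * w) % P for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]


def build_rbt_code(xs, alpha, Sa, Sb):
    """Return (Psi, lam, Pm, Cp) with Cp = Psi M Pm and Pm = Phi_alpha^T, as in (36)."""
    Phi = [[pow(x, j, P) for j in range(alpha)] for x in xs]
    lam = [pow(x, alpha, P) for x in xs]
    Psi = [row + [l * v % P for v in row] for row, l in zip(Phi, lam)]
    Pm = [list(col) for col in zip(*Phi[:alpha])]  # columns are mu_f = phi_f
    Cp = mat_mul(mat_mul(Psi, Sa + Sb), Pm)        # Sa + Sb stacks M = [S^a; S^b]
    return Psi, lam, Pm, Cp


def repair_by_transfer(Psi, lam, Pm, Cp, f, helpers, alpha):
    """Rebuild node f (f < alpha) when each helper i only forwards its stored symbol Cp[i][f]."""
    received = [[Cp[i][f]] for i in helpers]                   # no computation at helpers
    v = mat_mul(mat_inv([Psi[i] for i in helpers]), received)  # v = M phi_f
    # By symmetry of S^a and S^b: c_f = S^a phi_f + lambda_f S^b phi_f.
    c_f = [(v[t][0] + lam[f] * v[alpha + t][0]) % P for t in range(alpha)]
    return mat_mul([c_f], Pm)[0]                               # node f stores c_f^T P
```

Here `helpers` must list $d = 2\alpha$ distinct surviving nodes. The helpers perform no arithmetic at all; only the replacement node computes.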


B. Code Shortening 


The notion of code shortening allows us to construct $d > (2k-2)$ PM codes from a class of $d = (2k-2)$ PM codes. Here we describe the PM code shortening of [16], stated in terms of generator matrices.

For a generator matrix $G'$, consider the submatrix $G$ obtained by omitting the first $t$ rows and first $t$ columns of $G'$. We refer to the code defined by $G$ as the code $G'$, shortened by the first $t$ symbols.

An $[n,k,d > 2k-2]$ PM code can be constructed by simply shortening an $[n',k',d' = (2k'-2)]$ PM code, as follows.

Lemma 4. (From Theorem 6 of [16].) For any $[n,k,d > 2k-2]$, let $G'$ be the generator matrix of an $[n' = n+i,\; k' = k+i,\; d' = d+i = (2k'-2)](\alpha,\beta)$ systematic PM code, where $i := d-(2k-2)$. Let $G$ be the submatrix of $G'$ obtained by omitting the first $i\alpha$ rows and first $i\alpha$ columns. Then $G$ defines a systematic $[n,k,d](\alpha,\beta)$ PM code.


Proof: Informally, restricting to a submatrix as above can be thought of as considering the subcode of $G'$ in which the first $i$ nodes store all 0-symbols (or equivalently, where the first $i\alpha$ message symbols are all 0). The regeneration and repair properties of $G'$ still hold in $G$ with $i$ fewer helpers ($k = k'-i$, $d = d'-i$), since the first $i$ "dummy nodes" of $G'$ can be assumed to always send 0 when participating in regeneration or repair. Further, this new code still operates at the MSR point, since the number of message symbols is $k'\alpha - i\alpha = k\alpha$.

Formally, the statement follows directly from Theorem 6 and Corollary 8 of [16].

□
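The bookkeeping in Lemma 4 can be checked with a short sketch. The function names are hypothetical, and the generator matrix is assumed to be given as a plain list of rows.

```python
def shortening_parameters(n, k, d):
    """Parameters of the [n', k', d' = 2k' - 2] code to shorten, following Lemma 4."""
    i = d - (2 * k - 2)
    if i < 0:
        raise ValueError("expected d >= 2k - 2")
    alpha = d - k + 1
    n2, k2, d2 = n + i, k + i, d + i
    # The larger code sits at d' = 2k' - 2 and has the same alpha.
    assert d2 == 2 * k2 - 2 and d2 - k2 + 1 == alpha
    return i, (n2, k2, d2), alpha


def shorten(G_prime, t):
    """Drop the first t rows and first t columns of a generator matrix (list of rows)."""
    return [row[t:] for row in G_prime[t:]]
```

For an $[n = 8, k = 3, d = 6]$ target, this gives $i = 2$ and a $[10, 5, 8]$ code to shorten, with $\alpha = 4$ in both; the shortening step is then `shorten(G_prime, i * alpha)`.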


C. Explicit Construction 

Sparse systematic $d > (2k-2)$ MSR codes can be constructed by RBT-transforming a $d' = (2k'-2)$ PM code, and then shortening appropriately. The following theorem presents this result.

Theorem 6. Consider an $[n,k,d > (2k-2)]$ systematic PM code $\mathcal{C}$ constructed by shortening an $[n' = n+i,\; k' = k+i,\; d' = (2k'-2)]$ systematic PM code $\mathcal{C}'$ that supports RBT-SYS, where $i := (d-(2k-2))$. Let $G$ denote the generator matrix for code $\mathcal{C}$. Letting $G^{(j)}$ denote the $(\alpha \times k\alpha)$ submatrix of $G$ for node $j$, the following sparsity holds for all nodes $j$.

• The first $(d-2k+2)$ rows of $G^{(j)}$ are $k$-sparse.

• The remaining $(k-1)$ rows of $G^{(j)}$ are $d$-sparse.


Proof: By Lemma 4, the shortened generator matrix $G$ defines an $[n,k,d](\alpha,\beta)$ linear systematic MSR code. The sparsity of $G$ follows from applying Theorem 4 to the code $G'$. In particular, the first $i\alpha$ columns of $G'$ are omitted in $G$. In the code $G'$, these columns correspond to symbols in the first $i$ systematic nodes; we interchangeably denote these columns/nodes by the set $N$.

Consider a row of $G'$ corresponding to a symbol transferred for the repair (via RBT) of some systematic node $\nu \in N$. By Theorem 4, the restriction of this row to columns outside $N$ must be $k$-sparse, since it can only be supported on one symbol per systematic node $\mu \notin N$. There must be $|N| = i = (d-2k+2)$ such rows per $G^{(j)}$, since the code $G'$ supports RBT-SYS, and symbols transferred from a given node for the repair of two different nodes must be linearly independent (in $d' = (2k'-2)$ PM codes) by Property 3, Section 6 of [28].

Now consider a row of $G'$ corresponding to a symbol transferred for the repair (via RBT) of some systematic node $\nu \notin N$. By Theorem 4, the restriction of this row to columns outside $N$ must be $(\alpha + k - 1) = d$-sparse, since it can only be supported on the $\alpha$ symbols of $\nu$ plus one symbol per remaining systematic node $\mu \notin N$, $\mu \neq \nu$. This comprises the remaining rows of each $G^{(j)}$, similarly by the RBT-SYS property and Property 3, Section 6 of [28]. □

Remark 4. It is interesting to note that the sparsity provided by the codes of Theorem 6 is greater than the sparsity guaranteed by a generic $[n,k,d > (2k-2)](\alpha,\beta = 1)$ linear systematic MSR code that supports RBT-SYS. By Theorem 5, such a code would be such that the first $k$ symbols stored in every node are $d$-sparse, while the remaining symbols may be dense.
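To make the comparison in Remark 4 concrete, the sketch below counts upper bounds on the number of nonzero generator entries per parity node under each guarantee, for $\beta = 1$. The function name is hypothetical, and the counts are worst-case bounds implied by Theorems 5 and 6, not measured values.

```python
def nonzeros_per_node(k, d):
    """Upper bounds on nonzero generator entries per parity node, for beta = 1.

    Returns (theorem6_bound, generic_rbt_sys_bound, dense_count).
    """
    alpha = d - k + 1
    i = d - (2 * k - 2)
    if i < 0:
        raise ValueError("expected d >= 2k - 2")
    # Theorem 6: i rows are k-sparse, the remaining (k - 1) rows are d-sparse.
    thm6 = i * k + (k - 1) * d
    # Theorem 5: min(alpha, k) rows are d-sparse; any other rows may be fully dense.
    rows = min(alpha, k)
    generic = rows * d + (alpha - rows) * k * alpha
    # A dense node has alpha rows, each with B = k * alpha entries.
    dense = alpha * k * alpha
    return thm6, generic, dense
```

For $k = 3$, $d = 6$ (so $\alpha = 4$), the bounds are 18 nonzeros per node under Theorem 6, 30 under the generic RBT-SYS guarantee, and 48 for a fully dense node.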


VIII. Equivalence in Sparse Systematic PM Code Constructions 


The previous sections present two different ways of constructing a sparse $d = (2k-2)$ PM code from a given $d = (2k-2)$ PM code:

(1) Apply the RBT-transformation of Section VII-A to yield a code that is sparse (by Theorem 5).

(2) Transform the encoding matrix $\Phi$ to contain an identity block, as in Equation (32) of Theorem 3.

Interestingly, it turns out that these two constructions are equivalent up to a transform termed symbol-remapping, which is defined below.


Symbol-remapping is defined as any invertible transformation on the message-space of a code. For example, systematic-remapping is a special case of symbol-remapping for achieving systematic codes. Two codes with encoding functions $f_1$ and $f_2$ are equivalent up to symbol-remapping if

$$f_1 = f_2 \circ T \tag{37}$$

for some invertible transform $T$.

Theorem 7. For a given $d = (2k-2)$ PM code $\mathcal{C}$ with encoding matrix $\Psi = [\Phi \;\; \Lambda\Phi]$, consider a related code $\mathcal{C}'$ wherein the data in each node is further transformed by an invertible linear transformation $P$. That is, the entire encoding operation is $C' = \Psi M P$. Then $\mathcal{C}'$ is equivalent to a PM code with the below encoding matrix $\Psi'$ up to symbol-remapping.

$$\Psi' := [\Phi P^{-T} \;\; \Lambda\Phi P^{-T}] \tag{38}$$


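Proof: Consider transforming each message-submatrix $S^a$ and $S^b$ by

$$S^a \to \tilde{S}^a := P^{-T} S^a P^{-1}, \qquad S^b \to \tilde{S}^b := P^{-T} S^b P^{-1}. \tag{39}$$

Notice that this transformation is invertible and preserves symmetry, so it is a symbol-remapping on the message-space of PM codes.

If we then encode $\mathcal{C}'$ using message matrices $\tilde{S}^a$ and $\tilde{S}^b$, the entire encoding operation will be:

\begin{align}
\Psi \begin{bmatrix} \tilde{S}^a \\ \tilde{S}^b \end{bmatrix} P &= \Psi \begin{bmatrix} P^{-T} S^a P^{-1} \\ P^{-T} S^b P^{-1} \end{bmatrix} P \tag{40}\\
&= \Psi \begin{bmatrix} P^{-T} S^a \\ P^{-T} S^b \end{bmatrix} \tag{41}\\
&= [\Phi P^{-T} \;\; \Lambda\Phi P^{-T}] \begin{bmatrix} S^a \\ S^b \end{bmatrix} \tag{42}\\
&= \Psi' M. \tag{43}
\end{align}

The above form is native PM encoding with the original message matrix $M = \begin{bmatrix} S^a \\ S^b \end{bmatrix}$ and the new encoding matrix $\Psi' := [\Phi P^{-T} \;\; \Lambda\Phi P^{-T}]$.

□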
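The identity established in this proof can be checked numerically. The Python (3.8+) sketch below is illustrative only: it assumes arithmetic over the prime field GF(257), and the function names are hypothetical.

```python
P = 257  # assumed prime field size, for illustration only


def mat_mul(A, B):
    """Matrix product over GF(P)."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def mat_inv(A):
    """Gauss-Jordan inverse over GF(P); raises StopIteration if A is singular."""
    n = len(A)
    M = [[v % P for v in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c])
        M[c], M[piv] = M[piv], M[c]
        s = pow(M[c][c], -1, P)
        M[c] = [v * s % P for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c]
                M[r] = [(v - f * w) % P for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]


def check_theorem7(Phi, lam, Sa, Sb, Pm):
    """Return True if Psi [S~a; S~b] Pm equals Psi' [S^a; S^b] over GF(P)."""
    Psi = [row + [l * v % P for v in row] for row, l in zip(Phi, lam)]
    Pinv = mat_inv(Pm)
    PinvT = transpose(Pinv)
    # Symbol-remapping (39): S -> P^{-T} S P^{-1}.
    Sa_t = mat_mul(mat_mul(PinvT, Sa), Pinv)
    Sb_t = mat_mul(mat_mul(PinvT, Sb), Pinv)
    lhs = mat_mul(mat_mul(Psi, Sa_t + Sb_t), Pm)
    # Psi' = [Phi P^{-T}   Lambda Phi P^{-T}], as in (38).
    Phi_p = mat_mul(Phi, PinvT)
    Psi_p = [row + [l * v % P for v in row] for row, l in zip(Phi_p, lam)]
    rhs = mat_mul(Psi_p, Sa + Sb)
    return lhs == rhs
```

The check holds for any invertible `Pm` and symmetric `Sa`, `Sb`; for example, one admissible choice is the transpose of the first $\alpha$ rows of `Phi`.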


Notice that if $P$ is chosen to support RBT-SYS (as in Section VII-A), then $P^T$ will be the first $\alpha$ rows of $\Phi$, and the encoding matrix

$$\Psi' = [\Phi P^{-T} \;\; \Lambda\Phi P^{-T}] = [\Phi\Phi_\alpha^{-1} \;\; \Lambda\Phi\Phi_\alpha^{-1}] \tag{44}$$

is identical to the encoding matrix (32) of the explicit sparse codes of Theorem 3.

Thus these two methods of constructing sparse codes are equivalent up to symbol-remapping.

Footnote to Remark 4: It turns out that the unified PM codes presented in [26] also have a certain degree of inherent sparsity, although not as sparse as the codes of Theorem 6. It can be shown using an inclusion map argument that the codes of [26], in systematic form, have the following sparsity pattern: the last $(\alpha - k)$ symbols stored in every node are $k$-sparse. Interestingly, the RBT-transformed version of these codes has essentially the complementary sparsity pattern (by the present remark).